Abstract

Studies in social stratification have used siblings as a tool to learn about the intergenerational transmission of advantage 
but less often have asked how siblings impact one another’s life chances. The author draws on social capital theory 
and hypothesizes that when youths attend college, they increase the probability that their siblings attend college. The 
author further hypothesizes that this effect is strongest among youths whose parents do not have college degrees. 
Findings from a U.S. national probability sample support both hypotheses. Although it is possible that confounding 
factors drive the estimates, the author conducts robustness checks that show that confounding would need to be 
very atypically strong to invalidate a causal interpretation. The positive main effect suggests that an intragenerational 
transmission of educational advantage exists alongside the intergenerational transmission that receives more attention. 
Effect heterogeneity points to the potential redundancy of college-educated siblings’ benefits when youths already 
receive similar benefits from college-educated parents. 

Studying intersibling effects is important in order to broaden 
scholars’ conceptions of the directions in which educational 
advantage flows and the roles siblings play in one another’s 
educational outcomes. Research in social stratification fre- 
quently analyzes siblings for the purpose of learning about 
intergenerational persistence but less frequently analyzes 
siblings to learn about intragenerational processes whereby 
siblings directly influence one another. Some research uses 
correlations between adult siblings’ socioeconomic charac- 
teristics as proxies for the total impact parents exert on the 
next generation (for a review, see Torche 2015), usually 
without considering that intersibling effects might also drive 
these correlations (cf. Knigge 2015). Other research consid- 
ers how siblings dilute parent-to-child effects, for example, 
by studying the effects of birth order and sibship size (Black, 
Devereux, and Salvanes 2005; Downey 1995; Gratz 2018). 
In birth order and sibship size studies, the role siblings play 
in the other siblings’ educational outcomes is one of dilution; 
under this framework, siblings hamper the educational 
attainment of the other siblings by absorbing parents’ time 
and resources that would otherwise go to a single child. 
Although this research analyzes siblings, the purpose is not 
to study how individuals’ outcomes affect their siblings’ 
outcomes; instead, parents remain front and center. Each sibling 
is considered a passive recipient of parents’ benefits, not an 
active mentor who can provide benefits of their own to sib- 
lings (cf. Roksa 2019). Studies of intersibling effects in edu- 
cation exist (e.g., Benin and Johnson 1984; Loury 2004), but 
they are few. 

The comparative rarity of research on intersibling effects 
is unfortunate because this research can advance knowledge 
of how family members transmit educational advantage. 
Most of what the field knows on this topic relates to intergen- 
erational processes, and sociologists have doubled down on 
this intergenerational focus by studying grandparent and 
great-grandparent influences (Mare 2011; Pfeffer 2014). The 
relationship between siblings’ educational attainments has 
substantive importance because a positive effect would 
imply that, above and beyond intergenerational forces, there 
exists an intragenerational, intersibling transmission of edu- 
cational advantage. A positive effect in this case would 
deepen stratification scholars’ understanding of the family 
processes that generate educational attainment. Moreover, a 
positive effect would be methodologically important because 
it would inspire caution in estimates of intergenerational per- 
sistence that rely on sibling correlations, because these esti- 
mates would capture not only the effect of parents but also 
the effect of siblings on one another. 

Intersibling effects also would have practical implica- 
tions. Because of a variety of disadvantages that students 
without college-educated parents face (Wilbur and Roscigno 
2016), a large socioeconomic gap exists in attendance at 
institutions of postsecondary education (henceforth, college 
attendance, collapsing two- and four-year institutions unless 
otherwise specified). This problem threatens both equity and 
overall economic prosperity. The U.S. Department of 
Education regularly produces reports on this issue (Cataldi, 
Bennett, and Chen 2018; Choy 2001; Redford and Hoyer 
2017). The consistent conclusion of these reports is that stu- 
dents without college-educated parents attend college at a far 
lower rate than students who have college-educated parents 
and that differences in high school academic achievement 
cannot fully explain this inequality. Accordingly, when for- 
mer president Obama described his ambitious goal for the 
United States to lead the world in college degree production, 
some argued that youths without college-educated parents 
needed to be key targets given their comparatively low col- 
lege participation and large population (Bowen, Chingos, 
and McPherson 2009; Templin 2011). The United States 
attempts to increase these youths’ college participation with 
thousands of precollege outreach programs, 71 percent of 
which specifically target students whose parents are not col- 
lege educated (Swail and Perna 2002). Among those 71 per- 
cent are the federal TRiO programs, a series of federally 
funded programs with an annual budget of nearly $1 billion 
(Falk, Lynch, and Tollestrup 2018). Given the weight the 
United States accords this problem, intersibling effects may 
have practical importance: knowing how siblings affect one 
another can be useful in evaluating the total benefit of inter- 
ventions that target disadvantaged youth because the inter- 
ventions might have spillover effects on participants’ 
siblings. For example, if sibling college attendance makes 
one’s own college attendance more likely, then college 
access programs may promote social mobility among disad- 
vantaged youth who do not participate in the programs but 
who have siblings who do. The overall benefit of college 
access programs, then, is greater than typically assessed 
when studying program participants only. 

Undergraduate education is a particularly fruitful stage to 
examine because this stage is a pivotal step toward upward 
social mobility (Blau and Duncan 1967; Hout 1988; Torche 
2011). A bachelor’s degree, in particular, bestows powerful 
socioeconomic benefits. Hout (1988) found that social 
background is unassociated with occupational class among 
those who have bachelor’s degrees. Torche’s (2011) more 
recent analysis showed that social background is associated 
with occupational class and earnings more weakly among 
those whose highest degree is a bachelor’s degree compared 
to those without a bachelor’s degree (although a strong social 
background gradient emerges among people who have 
advanced degrees), and some have found that positive selec- 
tion of bachelor’s degree holders from modest socioeco- 
nomic backgrounds cannot explain this result (Karlson 2019; 
cf. Zhou 2019). Receiving an undergraduate education is a 
further important outcome because it confers nonpecuniary 
benefits such as happiness (Andersson 2018) and lifetime 
health (Cutler and Lleras-Muney 2006). 

In this study, I estimate the effect of sibling college atten- 
dance on one’s own college attendance. I then examine effect 
heterogeneity by parental education. Informed by Coleman’s 
(1988) concept of social capital, I hypothesize that sibling 
college attendance positively affects one’s own college atten- 
dance and further hypothesize that the effect is weaker 
among individuals with college-educated parents compared 
with those without. The results support both hypotheses. 
Because a vast literature in social stratification shows the 
many shared family and environmental factors that could 
lead siblings to similar educational outcomes, I then execute 
robustness checks to determine the extent of bias that would 
be necessary to invalidate a causal interpretation of my 
estimates. 


The Salience of Siblings 


The sociology of the family is a core subfield of sociology, 
yet family research investigating intersibling influences is 
sparse, especially in relation to education (Davies 2018). 
Most family studies explore romantic relationships or par- 
ent-child relationships, paying little attention to how siblings 
influence one another. An early article (Irish 1964) made 
three claims: (1) few sociologists of the family have 
researched sibling relationships; (2) most sibling studies 
investigate how siblings differ in their relationships with 
other people, such as parents; and (3) more studies on sibling 
relationships would enrich scholarship in the sociology of 
the family. Progress has been slow, though: McHale, 
Updegraff, and Whiteman (2012) offered an updated review 
on sibling relationships research, concluding that “although 
siblings are building blocks of family structure and key play- 
ers in family dynamics, their role has been relatively 
neglected by family scholars” (p. 913). 

The relatively small literature on sibling relationships 
suggests that siblings, especially older siblings, are forma- 
tive in one’s development. Older siblings who rapidly prog- 
ress in their identity development facilitate rapid progress on 
the part of their younger siblings (Wong et al. 2010). Youths’ 
narratives of themselves frequently center around the ways 
they are similar to and different from their siblings (Davies 
2015). As individuals go through adolescence, they often 
adopt the attitudes and tastes of their older siblings (McHale 
et al. 2001). Younger siblings use their older siblings as mod- 
els for behavior, often mirroring their older siblings in age of 
sexual debut (Widmer 1997), body weight fluctuations 
(Christakis and Fowler 2007), alcohol and drug use (Altonji, 
Cattan, and Ware 2017), cigarette use (Massey and Krohn 
1986), and career paths (Bingley, Lundborg, and Lyk-Jensen 
forthcoming; Schultheiss et al. 2002). Massey and Krohn 
(1986), in fact, showed that adolescents mimic their older 
siblings’ smoking behavior more than their parents’ smoking 
behavior. Even in adulthood, many individuals maintain 
extremely strong ties to their siblings: national data show 
that 60 percent of adults consider at least one sibling to be 
among their closest friends, and 30 percent would call a sib- 
ling first in case of an emergency (White and Riedmann 
1992). The intimacy of sibling relationships is perhaps not 
surprising given that children spend more time with their sib- 
lings than they do with friends, with any other family mem- 
bers, or alone (McHale and Crouter 1996). Moreover, 82 
percent of children live with at least one sibling, a percentage 
greater than the percentage living with a father figure 
(McHale et al. 2012). 


College Outcomes and Sibling College Attendance 


Why would sibling college attendance raise the probability 
of one’s own college attendance? One possible reason is that 
college-educated individuals may promote their siblings’ 
college attendance by providing social capital. As conceived 
by Coleman (1988), social capital is an individual’s stock of 
between-person relations that facilitate desired social out- 
comes. Social capital resembles other forms of capital in 
cultivating desired outcomes and having some degree of 
fungibility. This concept has been an important theory 
explaining how parents transmit their educational advan- 
tages to their children (Kim and Schneider 2005) but has 
seldom been applied to intersibling effects. Social capital 
takes three forms that capture, respectively, the information 
channels, social norms, and obligations people share with 
one another. I propose that information channels are particu- 
larly relevant to the present study, and thus, I describe below 
how this form of social capital could explain intersibling 
effects in college attendance. Because the data do not allow 
me to test this mechanism, my description is speculative 
only. 

College-educated individuals may facilitate their siblings’ 
college attendance by providing information that improves 
access to college. Coleman argued that information that 
inheres in social relations can help an individual achieve a 
desired end, such as attending college. Thus, if youths tend to 
lack information about the steps needed to attend college, 
and if college-educated individuals transmit such 
information to their siblings, then relations with 
college-educated siblings constitute a form of social capital 
that promotes college attendance. 

The weight of the evidence suggests that high school stu- 
dents have large holes in their knowledge of applying to and 
attending college. Avery and Kane (2004) showed that high 
school seniors overestimate college tuition by a factor greater 
than two. The parents of high school students, too, overesti- 
mate college tuition, and parents with less education often 
refuse even to hazard a guess of the price of college (Grodsky 
and Jones 2007). An experiment involving high school stu- 
dents who primarily do not have college-educated parents 
shows that telling them the tuition at nearby colleges sub- 
stantially increases expectations of attending college and 
reduces concerns about costs (Oreopoulos and Dunn 2013). 
Nonpecuniary information barriers also impede college 
attendance. High school seniors’ behavior suggests confu- 
sion about the requisite steps for attending college: among 
12th grade students in one low-income school who (1) 
express in the fall that they want to immediately attend a 
four-year college and (2) have the academic credentials to do 
so, more than 20 percent do not take the SAT, and 35 percent 
do not end up enrolling in a four-year college right away 
(Avery and Kane 2004). Additionally, filing the application 
for federal student aid is complex (Dynarski and Scott- 
Clayton 2006), and experimental evidence suggests that this 
complexity discourages college attendance. In particular, 
low-income, dependent students are 8 percentage points 
more likely to attend college when their parents receive an 
intervention including personalized help filing the federal 
student aid application, a streamlined process for filing, per- 
sonalized estimates of financial aid, and tuition estimates 
(Bettinger et al. 2012). An important insight from this experi- 
ment is that merely listing information in a brochure does not 
affect college attendance, but hands-on help applying this 
information, which college-educated siblings may offer, 
makes a substantial impact. 

Given scarce knowledge about postsecondary education, 
especially among disadvantaged youths and their parents, 
individuals who are in college plausibly provide siblings 
firsthand information and assistance that helps the siblings 
attend college. McDonough’s (1997) interviews with adoles- 
cents and their families support this theory. College-educated 
older siblings are key information sources for younger sib- 
lings as they consider whether and where to attend college. 
In some cases in which older siblings have ample knowledge 
about college attendance but parents have little, older sib- 
lings are the younger siblings’ primary sources of input, sup- 
port, and expertise. Stanton-Salazar and Spina’s (2003) 
qualitative evidence similarly suggests that older siblings 
transmit college-related information to their siblings. 

Why would intersibling effects be stronger for youths 
without college-educated parents? Effect heterogeneity 
would be in harmony with an information channels 
explanation: youths with college-educated parents already receive 
college-related information from their parents, and thus most 
information from siblings is likely redundant, whereas 
information from college-educated siblings is likely more novel 
for youths without college-educated parents. Figure 1 
illustrates this idea. Parents and siblings both may provide 
information about college, and this information is much more 
extensive if they attended college. However, if both parties 
have attended college, the information each provides should 
largely overlap with the information the other provides 
(Figure 1A). Thus, although a college-educated sibling 
probably provides some additional insights because he or she 
typically has attended college more recently than the parents, 
much of the information this sibling offers likely is 
redundant with information from college-educated parents. In 
contrast, if no parents have attended college, most of the 
information the sibling provides should be novel information 
that the parents cannot offer (Figure 1B). Consequently, the 
impact of a college-educated sibling should be stronger for 
those without college-educated parents. 

Figure 1. (A) College-educated parents and siblings both provide information that promotes college attendance, but the information 
largely overlaps. (B) Parents without college educations can provide little information about college, and therefore college-educated 
siblings provide information of novel value to their siblings. 


Prior Empirical Studies 


Just a handful of studies have asked how siblings influence 
one another’s educational attainment. Analyzing data from 
Nebraska, Benin and Johnson (1984) found a conditional 
association between siblings’ educational attainments. They 
claimed that this association reflects causal intersibling 
effects because the association is strongest among brother 
pairs and weakest among older sister–younger brother pairs, 
a form of heterogeneity that captures how sibling role 
modeling is most influential among same-sex siblings. Hauser and 
Wong (1989) also found particularly weak associations 
between older sisters’ and younger brothers’ educational 
attainments in Michigan and Nebraska. Loury (2004) studied 
a sample of African Americans in the baby boomer genera- 
tion and found that sibling college attendance increases the 
odds of individuals’ own college attendance. More recent 
work applying regression adjustment to data on SAT takers 
revealed a positive effect of sibling college attendance 
(Goodman et al. 2015). However, that study suffered from 
the limitation that SAT takers are a positively selected sam- 
ple, especially of the population without college-educated 
parents, and therefore the results may have missed especially 
strong intersibling effects among those expected to be less 
inclined to attend college in the first place. All of these stud- 
ies are in harmony with recent literature in economics that 
documents sibling spillovers in academic test scores 
(Karbownik and Ozek 2019; Qureshi 2018), though this lit- 
erature does not consider educational attainment. 


Contributions of the Present Study 


In this study I test two hypotheses: (1) sibling college atten- 
dance increases the probability of one’s own college atten- 
dance, and (2) the effect of sibling college attendance on 
one’s own college attendance is greater among those whose 
parents do not have college degrees compared with those 
whose parents do. I test these hypotheses using data from the 
High School Longitudinal Study of 2009 (HSLS), which 
provides a large national probability sample. Prior literature 
germane to either hypothesis is sparse, but hypothesis 2 is 
especially understudied. Furthermore, to my knowledge, no 
published study has estimated intersibling effects on college 
attendance, or parental education–based effect heterogeneity 
therein, using data representative of the full youth population 
in the United States. Within the small set of prior studies 
investigating intersibling effects on educational attainment, 
each has used samples heavily restricted with respect to geo- 
graphic region, race, or propensity to attend college, with 
most studies using much older data. In sum, this study con- 
tributes to prior literature by assessing how intersibling 
effects vary with respect to parental education and doing so 
using data that are unique in being both nationally represen- 
tative and recent. 


Methods 


Data 


I analyze data from all waves of the HSLS, a study with three 
interview waves (2009, 2012, and 2016) plus a high school 
transcript collection in 2013. The 23,000 HSLS participants 
make up a probability sample of U.S. youth who were in ninth 
grade in the fall of 2009. There exist several large-scale, 
longitudinal studies with national probability samples of 
youth, but HSLS has important advantages: it is among those 
with the largest sample sizes, and it is the most recent. These 
properties of the data aid in generalizing to the secondary stu- 
dent population of today. 


Measures. The outcome of interest is whether the respondent 
ever attended college. I code this outcome as a binary 
outcome equal to 1 if the respondent had ever enrolled in an 
institution of postsecondary education by the last wave of 
data and equal to 0 otherwise. This measure comes from the 
last wave, during which respondents indicated the last month 
and year that they were enrolled in an institution of 
postsecondary education. If the respondent listed a date, I code the 
outcome as 1, and if the respondent indicated that he or she 
had never enrolled in an institution of postsecondary 
education, I code the outcome as 0. 

The chief independent variable, or treatment, is whether 
the respondent has a sibling who attended college. I code 
treatment status according to respondents’ self-reports of 
siblings’ college attendance. In particular, respondents who 
had not attended college answered the question “Do you 
have any brothers or sisters who had started college or trade 
school by the end of February 2016?” and respondents who 
had attended college answered the question “Do you have 
any brothers or sisters who started college or trade school 
before you did?” I code the treatment as 1 if the respondent 
answered yes and 0 if the respondent answered no.¹ Thus, if 
a respondent has a sibling who attended college but this sib- 
ling started college after the respondent, I consider the 
respondent to be in the nontreated group. This coding 
accords with principles of causal ordering: if the respondent 
started college first, then the sibling’s subsequent college 
attendance could not possibly have caused the respondent to 
attend college. 
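
The coding rules for the outcome and the treatment described above can be sketched in a few lines of Python. This is only an illustration: the function names and argument names are hypothetical, not HSLS variable names, and the sketch does not address missing or skipped responses, which the text above does not discuss.

```python
from typing import Optional


def code_outcome(last_enrollment_date: Optional[str]) -> int:
    """Return 1 if the respondent listed any last-enrollment date, else 0.

    `last_enrollment_date` is a hypothetical field: a month/year string when
    the respondent reported one, or None when the respondent indicated never
    having enrolled in postsecondary education.
    """
    return 1 if last_enrollment_date is not None else 0


def code_treatment(
    attended_college: int,
    sibling_started_by_feb_2016: Optional[bool],
    sibling_started_before_respondent: Optional[bool],
) -> int:
    """Return 1 if a sibling's college attendance could precede the respondent's.

    Respondents who attended college are asked whether a sibling started
    BEFORE they did; respondents who did not attend are asked whether a
    sibling had started by the end of February 2016.
    """
    if attended_college == 1:
        answer = sibling_started_before_respondent
    else:
        answer = sibling_started_by_feb_2016
    return 1 if answer else 0


# A respondent who attended college and whose sibling enrolled only afterward
# is placed in the nontreated group, which preserves causal ordering.
example = code_treatment(1, True, False)
print(example)
```

The design choice worth noting is that the question used to code treatment depends on the respondent's own outcome, so a sibling who enrolled after the respondent never counts as treatment.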

The moderator variable is whether the respondent has a 
parent with a postsecondary degree or certificate. If the 
respondent reports that the highest level of education com- 
pleted by both the mother or female guardian and the father or 
male guardian is a high school diploma or less, I classify the 
respondent as a would-be first-generation college student. For 
brevity, I say that these students are first-generation. If the 
respondent reports that either parent or guardian has attained 
a postsecondary credential, I classify the respondent as a 
would-be continuing-generation college student. I say that 
these respondents are continuing generation. Postsecondary 
credentials include bachelor’s degrees, associate’s degrees, 
certificates or diplomas from schools providing occupational 
training, and advanced degrees.² 

¹The available measure of sibling college attendance does not allow 
differentiation between siblings who attended a four-year college 
and siblings who attended a two-year college, nor does the measure 
allow differentiation between part-time and full-time enrollment. I 
choose to measure the outcome variable, detailed above, in a 
similarly broad fashion in order for the outcome to optimally mirror the 
independent variable. 

Respondents whose siblings attend college are likely dif- 
ferent from other respondents in a host of ways that predict 
whether the respondents themselves attend college. Observed 
differences between the college attendance rates of individu- 
als with and without college-educated siblings, then, may 
reflect the influence of confounding factors rather than a true 
causal relationship. To ameliorate this problem, I control for 
several factors that may have influenced the siblings’ college 
attendance. For each respondent, I control for sex, race, 
father’s and mother’s years of schooling,³ whether the father 
is unemployed, whether the mother is unemployed, family 
income, number of siblings, number of household members, 
parental marital status, parent age, hours spent with family on 
a typical school day, parent’s educational expectations of the 
respondent, whether the respondent is a member of a religious 
group, standardized math test score, cumulative high school 
grade point average (GPA), the highest level of high school 
math completed by the respondent, number of Advanced 
Placement (AP) or International Baccalaureate (IB) courses 
taken by the respondent, how often the respondent is late to 
class, how often the respondent fails to finish assigned home- 
work, how often the respondent is absent from school, hours 
spent on homework on a typical school day, the respondent’s 
effort in math class, the respondent’s effort in science class, 
the respondent’s region of the country, and the respondent’s 
locale type. Table A.1 in Online Appendix A details the coding 
procedures for all measures analyzed in this study. Some of the 
listed control variables, such as effort in math class, present an 
ambiguous case in which it is unclear whether the variables 
are mediators or confounders, a pervasive problem in social 
science (King 2010). On the one hand, if the sibling’s college 
attendance motivates the respondent to put effort into math 
class, then controlling for this effort may induce overcontrol 
bias as well as endogenous selection bias (Elwert and Winship 
2014). On the other hand, if effort in math class measured in 
the second wave captures respondent characteristics not 
attributable to sibling college attendance as measured during the 
final wave, then controlling for effort reduces omitted variable 
bias. The data do not indicate when the sibling first attended 
college, so it is unclear whether effort in math class precedes 
sibling college attendance. Given the real concern for 
upward bias due to unobserved confounding, I choose to err 
on the side of more control variables. However, for 
reference, I also report the estimates from models that omit 
covariates that are potentially posttreatment. These models 
yield estimates that are qualitatively the same but 
quantitatively less conservative compared with those I will present 
in my main documentation. 

²The data do not allow me to distinguish between parents who never 
attended college and those who started but did not attain a 
credential. On the basis of my own calculations using Current Population 
Survey data (available at https://www.census.gov/cps/data/cpstablecreator.html), 
in 2009, about 17 percent of U.S. residents 35 
to 54 years of age (a plausible age range for the parents of HSLS 
respondents) had attended college without receiving a 
postsecondary degree. Therefore, especially if information channels from 
people who have applied to and attended college drive youths’ 
college attendance, the available measure of parental education likely 
understates effect heterogeneity between youths who have parents 
with postsecondary degrees and youths whose parents have never 
attended college, as parents who attended but did not complete 
college can provide some of the same information that parents with 
college degrees provide. 

³This covariate adjusts for the remaining variation in parental 
education after stratifying the sample into the first-generation and 
continuing-generation subsamples. For example, the control variables 
help distinguish a continuing-generation student whose father has 
a bachelor’s degree and whose mother has a high school diploma 
from a continuing-generation student for whom both parents have 
advanced degrees. 

Figure 2. Causal model describing the relationship between 
sibling college attendance $C_s$ and respondent college attendance $C_r$. 
Note. $X_{s,r}$ captures all measured confounders that influence both siblings’ 
college attendance, and $U_{s,r}$ captures unmeasured confounders. $U_s$ 
represents the factors that influence the sibling’s college attendance but 
are not directly related to that of the respondent. $U_r$ represents the 
factors that influence the respondent’s college attendance but are not 
directly related to that of the sibling. Arrows denote causal relationships 
and the dashed line denotes a correlation. 


Analytic Strategy 


Estimating the effect of sibling college attendance on one’s 
own college attendance is challenging because siblings usu- 
ally, though not always, share environmental and genetic fea- 
tures that predispose them to attend or not attend college. 
According to the causal model in Figure 2, sibling college 
attendance, $C_s$, increases the probability of respondent 
college attendance, $C_r$. Sibling college attendance and 
respondent college attendance also are both functions of observed 
confounders $X_{s,r}$ that jointly influence both siblings’ college 
attendance. Elements of $X_{s,r}$ are the control variables listed in 
the previous section, such as family income and hours spent 
with family. In addition to observed confounders, 
unobserved confounders $U_{s,r}$ also jointly influence siblings’ 
college attendance. Elements of $U_{s,r}$ may include grandparental 
wealth, educational backgrounds of adults in the 
neighborhood, shared genes that predict educational attainment, and 
family processes such as bedtime reading that vary above 
and beyond factors in $X_{s,r}$. Finally, each sibling has a set of 
factors that influences his or her college attendance but is not 
directly related to the other sibling’s college attendance ($U_r$ 
for the respondent and $U_s$ for the sibling). Examples may 
include teachers (Chetty, Friedman, and Rockoff 2014) and 
genes (Domingue et al. 2015) that the siblings do not share. 
This set of idiosyncratic factors is the engine of intersibling 
effects because it provides the variation in the sibling’s col- 
lege attendance that is not due to shared influences; in turn, 
this variation allows sibling college attendance indepen- 
dently to cause changes in respondent college attendance. 
Take the concrete example of genes. Some genotypic fea- 
tures can help predict educational attainment, and within full 
biological sibling pairs, these features are allocated randomly 
at conception (Fletcher and Lehrer 2011). Thus, the exoge- 
nous component of the sibling’s genes can cause changes in 
the respondent’s college attendance that operate only by way 
of the sibling’s college attendance. 
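
As a compact restatement of the causal model described above, the structural relationships can be written as two equations. This is only a sketch: the function symbols $f$ and $g$ and the subscripts $s$ (sibling) and $r$ (respondent) are notation assumed here for illustration.

```latex
\begin{align}
C_s &= f\left(X_{s,r},\; U_{s,r},\; U_s\right) \\
C_r &= g\left(C_s,\; X_{s,r},\; U_{s,r},\; U_r\right)
\end{align}
```

Under this reading, the intersibling effect is the dependence of $C_r$ on $C_s$ with $X_{s,r}$ and $U_{s,r}$ held fixed, and the variation in $C_s$ that makes the effect estimable originates in $U_s$.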

To obtain plausible causal estimates, I aim to minimize 
the unobserved confounders $U_{s,r}$ so that I compare the 
outcomes of respondents who are similar in as many ways as 
possible, except that one of the respondents has a sibling who 
attended college and the other has a sibling who did not. To 
this end, I control for observed factors $X_{s,r}$ using the inverse 
probability-weighted regression adjustment (IPWRA) 
estimator. IPWRA estimates a treatment model that includes 
each individual’s observed characteristics. The treatment 
model yields a propensity score for each individual, repre- 
senting the probability that someone with his or her charac- 
teristics receives the treatment. Next, IPWRA estimates an 
outcome model that includes observed characteristics while 
also weighting cases by their inverse probability weights; for 
individuals with a college-educated sibling, the inverse prob- 
ability weight is the inverse of the propensity score, and for 
individuals without a college-educated sibling, the inverse 
probability weight is the inverse of 1 minus the propensity 
score. This model generates predicted probabilities of col- 
lege attendance for all respondents, and the difference in 
average predicted probabilities between those with and with- 
out a college-educated sibling provides the point estimate. I 
apply this estimator to first-generation and continuing-gener- 
ation sibling sets separately. For each subgroup, I use a logis- 
tic regression model for both the treatment and outcome 
models and include all control variables listed above.⁴ Even
though I use logistic regression models for both the treatment
and outcome, IPWRA point estimates are always expressed
in probability units because they come from subtracting
average predicted probabilities. I use base wave–final wave
panel weights and robust standard errors to account for the
complex design of HSLS. Because I include panel weights, I
weight each case not merely by its inverse probability weight
but rather by the product of its inverse probability weight and
panel weight. In Online Appendix A, I check two of IPWRA’s
undergirding assumptions (covariate balance and positivity)
and find support for both in my case.

⁴Results (available upon request) are virtually identical when using
a linear probability model instead of a logistic regression model to
generate predicted probabilities of respondent college attendance.
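
The weighting step described above can be illustrated with a short Python sketch. It is a minimal illustration, not the estimation code used in the study: the function names and the example propensity scores and panel weights are hypothetical, and the propensity scores are taken as given rather than estimated from a treatment model.

```python
def ipw(propensity: float, treated: bool) -> float:
    """Inverse probability weight: 1/p for treated cases, 1/(1 - p) otherwise."""
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity score must lie strictly between 0 and 1")
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


def combined_weight(propensity: float, treated: bool, panel_weight: float) -> float:
    """Product of the inverse probability weight and the survey panel weight."""
    return ipw(propensity, treated) * panel_weight


# Hypothetical cases: (propensity score, sibling attended college, panel weight).
cases = [
    (0.80, True, 1.5),
    (0.80, False, 1.5),
    (0.25, True, 0.9),
    (0.25, False, 0.9),
]
weights = [combined_weight(p, t, w) for p, t, w in cases]
```

Cases whose treatment status is unlikely given their characteristics (for example, a respondent with a propensity score of 0.80 whose sibling did not attend college) receive the largest weights, which is how weighting can balance the treated and nontreated samples.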

IPWRA is gaining favor in the social sciences because it 
is consistent if either the treatment model or the outcome 
model is correctly specified (Wooldridge 2007). This doubly 
robust property offers the advantages of other propensity 
score methods (such as matching) as well as traditional 
regression adjustment techniques. As with propensity score 
matching, proper assumptions about the functional form 
between covariates and respondent college attendance are 
not required for consistent estimation, because inverse prob- 
ability weights have the potential to balance the treated and 
nontreated samples without these parametric assumptions 
(Thoemmes and Ong 2016). If the relationship between 
covariates and respondent college attendance is properly 
specified, though, IPWRA retains the ordinary least squares 
regression advantage of being consistent without needing to 
properly model the relationship between covariates and sib- 
ling college attendance. The latter advantage is attractive for 
this study because HSLS measures no sibling characteristics 
besides college attendance, and therefore the outcome model 
is more likely to be correctly specified than the propensity 
score model, which estimates sibling college attendance 
using only characteristics of respondents and their parents. 


Missing Data 


The target population contains U.S. ninth graders from the 
fall of 2009 who, in 2016, had siblings who were of tradi- 
tional college age or older. The initial HSLS sample contains 
23,503 individuals, but I drop cases from the sample in two 
steps. First, I drop 13,899 cases that appear outside the target 
population on account of being singletons, being the oldest 
of their siblings,⁵ or having provided no data on whether they
have siblings. Second, I drop 2,087 cases that are in the target
population but missing in the outcome, either because of unit
nonresponse in the final wave or item nonresponse on the
item about the last postsecondary enrollment.⁶ This leaves an
analytic sample size of 7,517 individuals, about 39 percent of
whom are first generation. To address missingness in other
measures, I perform multiple imputation by chained equations
with five imputations.

⁵I drop eldest children because most such respondents are unlikely
to have had younger siblings who had attended college before the
respondents, with enough time to influence the respondents’ own
decisions to attend college. Respondents were about 21 years old
during the final survey wave; thus, they were at most about 20 years
old when their younger siblings might have started college in time
to influence the eldest child’s decision to start college the following
year. Because I do not know the ages of the younger siblings, I drop
eldest child cases in order not to assume that the respondent had a
college-aged younger sibling the year before the final wave.
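
The two-step sample restriction can be checked with simple arithmetic. The sketch below uses only the case counts reported in this section; the variable names are illustrative.

```python
# Case counts as reported in the text.
initial_sample = 23_503   # initial HSLS sample
outside_target = 13_899   # singletons, eldest siblings, or no sibling data
missing_outcome = 2_087   # in the target population but missing the outcome

in_target = initial_sample - outside_target
analytic_sample = in_target - missing_outcome

# Share of in-target cases retained for analysis.
share_retained = analytic_sample / in_target

print(in_target, analytic_sample, round(share_retained, 3))
```

The result reproduces the analytic sample size of 7,517 and shows that roughly 78 percent of in-target cases are retained.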


Results 


Descriptive Statistics by Subgroup 


Table 1 shows means and standard deviations of each measure,
by treatment status and parental education group. The
raw association between sibling college attendance and the 
respondent’s college attendance is strong: 91 percent of those 
with siblings who attended college themselves attend col- 
lege, compared with 54 percent among those who have at 
least one older sibling but no siblings that attended college. 
Table 1 also demonstrates how those with siblings who 
attended college are advantaged with respect to socioeco- 
nomic, academic, and geographic factors that predict college 
attendance, such as family income, parental employment, 
math test scores, high school GPA, AP or IB participation, 
location in the Northeast, and suburbanicity. In sum, descrip- 
tive statistics show a relationship between siblings’ college 
attendance but also demonstrate the need to control for a host 
of factors on which those with and without college-educated 
siblings differ if one wishes to test a causal relationship. 


Effect of Sibling College Attendance 


The findings support the theory that, among first-generation 
sibling sets, sibling college attendance raises the probability 
that an individual attends college. Figure 3 shows sibling 
college attendance effect estimates on respondent college 
attendance. Sibling college attendance raises the probability 
of first-generation individuals’ college attendance by an esti- 
mated 28 percentage points. The estimate is quite precise 
with a narrow 95 percent confidence interval (0.23-0.33) 
that is far from including zero. As reference for the practical 
significance of this estimate, note that the effect size exceeds 
the impact of Head Start participation on college attendance 
by a factor of about 5 (Ludwig and Miller 2005). Similarly,
the unconditional female advantage in college attendance,
which rose meteorically over the second half of the twentieth
century (Buchmann and DiPrete 2006), is less than a third of
the magnitude of the estimated sibling college attendance
effect, on the basis of the present data source (8 percentage
point advantage in college attendance rates for women
compared with men).

⁶Comparing the retained 7,517 cases with the dropped 2,087 cases
with respect to observed covariates, it seems likely that retained
cases have a higher college attendance rate than the dropped cases;
retained cases have greater academic achievement as measured by
factors such as high school GPA, as well as higher family incomes
and a greater likelihood of being female (full table of standardized
differences available upon request). Therefore, to the extent that
individuals with greater initial propensities to attend college are less
sensitive to their siblings’ college attendance, omitting cases missing
in the outcome probably leads to downward bias when estimating
the effect of sibling college attendance on respondent college
attendance.

Findings among the continuing-generation sample also 
support a causal link between siblings’ college attendance. 
Sibling college attendance raises the probability of
continuing-generation individuals’ college attendance by an
estimated 14 percentage points.⁷ The narrow 95 percent
confidence interval (0.10–0.18) does not include zero, so
sampling error is not a likely explanation for the positive 
estimate. 

The estimated effect of sibling college attendance among 
the first-generation sample is nearly twice the magnitude of 
the estimated effect among the continuing-generation sam- 
ple, suggesting that sibling college attendance may well have 
a greater impact on first-generation people than on
continuing-generation people.⁸ Moreover, because the 95 percent
confidence intervals of the two estimates do not overlap, it is
unlikely that the difference in magnitude is due to sampling
error (Figure 3).⁹ Previous research not using national
probability samples has shown a positive main effect of sibling


⁷I obtain alternative estimates by omitting all covariates that I
consider potentially posttreatment: math score, GPA, highest math
course, number of AP or IB courses, class tardies, school absences, 
frequency of failing to finish homework, time on homework, effort 
in math, and effort in science. For the first-generation sample, the 
estimate is 0.35 (greater by a factor of 1.25), and for the continuing- 
generation sample, the estimate is 0.19 (greater by a factor of 1.36). 
Qualitative conclusions remain the same, but unsurprisingly, the 
estimates are quantitatively greater than when controlling for the 
potentially posttreatment measures. 

⁸It is possible that the apparent effect heterogeneity reflects
differential sensitivity to unobserved confounders. Although I cannot
directly test this proposition, I use each parental education group’s
sensitivity to observed covariates as a proxy for their sensitivity
to unobserved confounders. I first run a model predicting college
attendance, separately for each parental education group. I then
divide the coefficient for each predictor from the first-generation
sample model by the corresponding continuing-generation
coefficient. Next, I compute the average of the absolute values of the
ratios, weighted by the t statistics to give more weight to covariates
with stronger impacts. I find that observed covariates are, on aver- 
age, 1.4 times stronger in determining the first-generation sample’s 
college attendance compared with that of the continuing-genera- 
tion sample. If the twice-as-great estimate for the first-generation 
sample were due only to differential sensitivity, one would expect 
observed covariates to be twice as strong for the first-generation 
sample. Thus, these results are not consistent with the proposition 
that differential sensitivity is the sole reason the estimated effect of 
sibling college attendance is greater among first-generation youths. 
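
The ratio calculation described in this note can be sketched as follows. This is a minimal illustration under stated assumptions: the coefficients and t statistics are hypothetical, and because the text does not specify which model's t statistics serve as weights, the sketch simply takes one list of t statistics and uses their absolute values.

```python
def weighted_mean_abs_ratio(first_gen, continuing, t_stats):
    """Average of |first-gen coef / continuing-gen coef|, weighted by |t|."""
    ratios = [abs(b_fg / b_cg) for b_fg, b_cg in zip(first_gen, continuing)]
    weights = [abs(t) for t in t_stats]
    return sum(r * w for r, w in zip(ratios, weights)) / sum(weights)


# Hypothetical coefficients for three predictors in each subgroup model.
first_gen_coefs = [0.60, -0.30, 0.20]
continuing_coefs = [0.40, -0.25, 0.20]
# Hypothetical t statistics used as weights (an assumption of this sketch).
t_stats = [4.0, 2.0, 1.0]

sensitivity_ratio = weighted_mean_abs_ratio(first_gen_coefs, continuing_coefs, t_stats)
print(round(sensitivity_ratio, 2))
```

A value near 1 would indicate that observed covariates matter about equally in both groups; the study reports a value of 1.4 for its own data.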

⁹Because the unconditional college attendance rate for the
continuing-generation sample (0.89) is so much closer to 1 than is the

Table 1. Means (Standard Deviations) of Each Measure Used in the Study, Separated by Treatment Status.

                                              All             Treatedᵃ        Nontreatedᵇ
Attended college                              [illegible]     .91             .54
Sibling attended college                      .65             1               0
First generation                              .37             .25             .54
Female                                        [illegible]     .52             .52
Non-Hispanic white                            [illegible]     .62             .52
Non-Hispanic black                            .11             .09             .13
Hispanic                                      .16             .13             [illegible]
Non-Hispanic Asian                            .06             .08             .03
Other/multiple race, non-Hispanic             [illegible]     .09             [illegible]
Father’s years of schooling                   13.92 (2.79)    14.65 (2.85)    12.88 (2.36)
Mother’s years of schooling                   13.74 (2.48)    14.42 (2.48)    12.89 (2.16)
Father unemployed                             [illegible]     [illegible]     [illegible]
Mother unemployed                             .25             .22             .27
Family Income ($1,000s)                       78.4 (62.3)     94.6 (65.5)     55.9 (46.9)
Number of siblings                            2.62 (1.79)     2.4 (1.59)      2.8 (1.95)
Number of household members                   4.26 (1.5)      4.27 (1.45)     4.22 (1.56)
Parent married                                .75             .82             .65
Parent divorced                               .14             .11             .18
Parent separated                              .04             .02             .04
Parent never married                          .06             .03             .10
Parent widowed                                .02             .02             .03
Parent’s age                                  44.99 (6.60)    46.02 (5.76)    43.82 (7.37)
Family time                                   3.06 (1.95)     2.96 (1.89)     3.16 (2.01)
Parent’s educational expectations for child   16.78 (2.47)    17.30 (2.11)    16.21 (2.73)
Religious group participant                   .55             [illegible]     .49
Math test score                               .09 (.96)       .38 (.9)        -.18 (.93)
High school GPA                               2.78 (.84)      3.07 (.71)      2.52 (.84)
No or low math                                .08             .03             [illegible]
Mid-academic 1 math                           .28             [illegible]     .37
Mid-academic 2 math                           .25             .31             [illegible]
Advanced academic math                        .34             .42             .28
Number of AP/IB courses                       1.25 (2.3)      1.79 (2.63)     .74 (1.8)
Never late to class                           .45             .48             .43
Rarely late to class                          .38             .39             .38
Sometimes late to class                       .11             .08             .12
Often late to class                           .02             .01             .03
Never fails to finish homework                [illegible]     .23             [illegible]
Rarely fails to finish homework               .42             .46             [illegible]
Sometimes fails to finish homework            .25             [illegible]     .28
Often fails to finish homework                [illegible]     .09             .12
Number of absences                            3.49 (3.31)     3.1 (3.07)      3.95 (3.57)
Time on homework                              1.06 (.84)      1.07 (.83)      1.05 (.87)
Effort in math                                .07 (.96)       .16 (.88)       -.02 (1.04)
Effort in science                             .07 (.94)       .13 (.88)       .01 (1)
Northeast                                     .16             [illegible]     .13
Midwest                                       .28             .29             .26
South                                         .39             .38             [illegible]
West                                          .18             .16             [illegible]
City                                          .28             .31             .25
Suburb                                        .36             .37             .35
Town                                          .12             .11             .12
Rural                                         .24             .22             .28

ᵃSibling attended college.
ᵇNo sibling attended college.
Figure 3. Estimates of the effect of sibling college attendance
on the probability of respondent college attendance, by parental 
education group. 

Note. Dots and bars represent the point estimates and 95 percent 
confidence intervals in the numerical expressions. Estimates come from 
inverse probability-weighted regression adjustment and are in probability 
units. 


college attendance on individuals’ own college attendance 
(Goodman et al. 2015; Loury 2004). My findings suggest 
that first-generation siblings see the strongest effects at the 
national level. Thus, efforts to increase the college atten- 
dance rate may be most efficient if they target first-genera- 
tion individuals because they are likely to induce the greatest 
spillover effects onto their siblings. 
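
The comparison of the two subgroup estimates can be reproduced from the reported figures. The sketch below uses the rounded point estimates and confidence intervals given in the text. The z statistic is an added illustration rather than a test reported in the study: it assumes the two estimates are independent and that standard errors can be recovered from the rounded interval half-widths.

```python
import math

# Rounded estimates and 95 percent confidence intervals reported in the text.
first_gen = {"estimate": 0.28, "ci": (0.23, 0.33)}
continuing = {"estimate": 0.14, "ci": (0.10, 0.18)}


def intervals_overlap(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def se_from_ci(ci, critical_value=1.96):
    """Approximate standard error from a symmetric 95 percent interval."""
    return (ci[1] - ci[0]) / (2 * critical_value)


overlap = intervals_overlap(first_gen["ci"], continuing["ci"])
ratio = first_gen["estimate"] / continuing["estimate"]

# Illustrative z statistic for the difference (independence is an assumption).
se_diff = math.sqrt(se_from_ci(first_gen["ci"]) ** 2 + se_from_ci(continuing["ci"]) ** 2)
z_diff = (first_gen["estimate"] - continuing["estimate"]) / se_diff
```

Nonoverlapping 95 percent intervals are a conservative criterion: when they do not overlap, a direct test of the difference between independent estimates is also significant at the 0.05 level.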

The results are consistent with the theory depicted in 
Figure 1, even though the results cannot definitively prove it. 
The theory predicts a weaker effect among continuing-gener- 
ation youths because of the partially redundant nature of col- 
lege-educated siblings’ social capital when the respondent 
already receives the same benefits from college-educated par- 
ents. One can easily see why there might be diminishing 
returns to significant people explaining the process of apply- 
ing to college and applying for financial aid: conceivably, it is 


rate for the first-generation sample (0.65), effect heterogeneity as 
measured by the IPWRA estimator captures both differences in the 
effect of sibling college attendance on respondents’ latent propensi- 
ties to attend college as well as differences in how proximate each 
sample is to the ceiling of college attendance probability. In analy- 
ses available upon request, I estimate a logistic regression model of 
college attendance with each control variable listed above and with 
an interaction term between sibling college attendance and first- 
generation status. The results reaffirm a positive treatment effect in 
both groups and a greater effect among the first-generation sample, 
although the interaction term is not statistically significant. Thus, 
even though, in an absolute sense (most relevant to the present 
study), sibling college attendance has a much greater effect on the 
probability of respondent college attendance in the first-generation 
sample than in the continuing-generation sample, the cross-group 
difference in how sibling college attendance affects the latent pro- 
pensity to attend college is a bit less marked. 
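
The ceiling issue raised in this note can be illustrated numerically. In the sketch below, the same shift in log-odds is applied at two baseline probabilities. The baselines are the unconditional attendance rates quoted in the note, used only as illustrative starting points, and the shift of 1.0 is hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of probability p."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to log-odds x."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_change(baseline: float, log_odds_shift: float) -> float:
    """Change in probability from adding a log-odds shift at a baseline."""
    return inv_logit(logit(baseline) + log_odds_shift) - baseline


shift = 1.0  # hypothetical effect on the latent (log-odds) scale
change_first_gen = prob_change(0.65, shift)
change_continuing = prob_change(0.89, shift)
```

An identical latent effect produces a larger change in probability for the group farther from the ceiling, which is why a difference in probability units can exceed the corresponding difference on the latent scale.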


enough to have one trusted point person, and for continuing- 
generation youths this point person is likely to be the parent, 
even in the presence of college-educated siblings. College- 
educated siblings may provide some extra, more up-to-date 
information, but much of the information they offer is likely 
to overlap with information that college-educated parents 
already provide. 


Robustness to Unobserved Confounders 


The estimated effect of sibling college attendance may reflect 
the influence of unobserved confounders, such as grandpa- 
rental wealth, and/or may reflect a genuine causal relation- 
ship. Even if sibling college attendance has a positive causal 
effect on youths’ own college attendance, estimates that are 
not (quasi-)experimental almost certainly overstate the impact
because of unobserved factors that lead both siblings to 
attend college. Whether bias due to unobserved confounding 
is present is not a very helpful question, because these esti- 
mates cannot realistically adjust for every confounding fac- 
tor. Frank et al. (2013) proposed the more useful question of 
how much bias must exist to invalidate a researcher’s claim 
that one variable causes another. I use the robustness check 
developed by Frank et al. to assess the extent of bias that 
would need to be present to invalidate my causal claims.¹⁰

For the first-generation sample, I find that in order to 
drive the estimate down to statistical insignificance at the 
0.05 level, 83 percent of cases would need to be replaced 
with cases for which there is an effect of zero. This is quite 
a high percentage given that unobserved confounders can 
only bias my estimates via variation that is orthogonal to 
observed covariates (e.g., the variation in grandparental 
wealth that is unrelated to parental education, family 
income, test scores). Nevertheless, the nature of observa- 
tional estimates and the infeasibility of random assignment 
mean that I cannot rule out the possibility that siblings share 
unobserved influences that are this strong. For the continu- 
ing-generation sample, the percentage is slightly lower, but 
still high, at 75 percent. 
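
The logic of this robustness check can be sketched in a few lines. This is a simplified illustration and not the pkonfound routine itself: it uses a normal critical value of 1.96 rather than a t distribution, and it recovers approximate standard errors from the rounded confidence intervals reported above, so its output only approximates the 83 percent and 75 percent figures.

```python
def percent_bias_to_invalidate(estimate, std_error, critical_value=1.96):
    """Share of the estimate that must be due to bias to lose significance."""
    threshold = critical_value * std_error  # smallest significant estimate
    if abs(estimate) <= threshold:
        return 0.0  # already not statistically significant
    return 1.0 - threshold / abs(estimate)


def se_from_ci(lower, upper, critical_value=1.96):
    """Approximate standard error from a symmetric 95 percent interval."""
    return (upper - lower) / (2 * critical_value)


# Rounded values reported in the text; unrounded inputs would differ slightly.
first_gen_share = percent_bias_to_invalidate(0.28, se_from_ci(0.23, 0.33))
continuing_share = percent_bias_to_invalidate(0.14, se_from_ci(0.10, 0.18))

print(round(first_gen_share, 2), round(continuing_share, 2))
```

The share is large whenever the estimate is many standard errors away from zero, which is the case for both subgroups.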

¹⁰Because the ultimate point estimates are linear (in probability
units), I execute Frank et al.’s robustness check as one would for a
linear regression coefficient. In particular, I use the pkonfound
command in Stata and input each relevant point estimate and standard
error derived from IPWRA, along with the number of covariates
(43, the number of covariates in both the outcome and treatment
models).

To put the robustness of these estimates further in context,
I make two comparisons. First, I compare the robustness of
my estimates with the robustness of estimates in the 10
observational studies that Frank et al. (2013) examined,
which made up all of the studies then available online first
for publication in Educational Evaluation and Policy
Analysis. The estimates in these studies required that between
2 percent and 60 percent of the corresponding estimates be
due to bias. The present percentages of 83 percent and 75
percent substantially exceed the maximum from these stud- 
ies. Second, I compare the bias necessary to invalidate my 
inferences to the bias accounted for by the strongest observed 
covariate (high school GPA¹¹). Supplementary analysis of
the first-generation sample shows that controlling for GPA 
reduces the estimate by 3.4 percent. Because 83 percent is 24 
times greater than 3.4, a statistically insignificant effect 
would require that there be 24 times more bias than is 
accounted for by high school GPA. Supplementary analysis 
of the continuing-generation sample shows that controlling 
for GPA reduces the estimate by 7.7 percent, and thus a sta- 
tistically insignificant effect would require that there be 9.7 
times more bias than is accounted for by high school GPA. 
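
The multiples quoted in this paragraph follow from dividing the bias needed to invalidate each inference by the bias accounted for by high school GPA. The sketch below reproduces that arithmetic from the reported percentages; the variable names are illustrative.

```python
# Percentages reported in the text.
bias_needed = {"first_generation": 83.0, "continuing_generation": 75.0}
bias_from_gpa = {"first_generation": 3.4, "continuing_generation": 7.7}

# How many times more bias than GPA accounts for would be required.
multiples = {group: bias_needed[group] / bias_from_gpa[group] for group in bias_needed}

for group, multiple in multiples.items():
    print(group, round(multiple, 1))
```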


Discussion 


I find that sibling college attendance increases the probabil- 
ity of an individual’s own college attendance. My study is the 
first to garner such evidence using data representative of the 
full youth population in the United States. Furthermore, my 
study is the first to uncover a more pronounced effect of sib- 
ling college attendance among first-generation sibling sets 
compared with continuing-generation sibling sets. This 
effect heterogeneity points to the potential redundancy of 
college-educated siblings’ benefits when youths already 
receive similar benefits from college-educated parents. The 
findings come from the HSLS and causal inference methods 
designed to be robust against confounding factors. 

The findings complement stratification research by sug- 
gesting that educational (dis)advantage flows intragenera- 
tionally in addition to the intergenerational flow that receives 
comparatively more attention. Research on socioeconomic 
persistence largely centers parents (Mare 2011), with a recent 
uptick in studies about grandparents and great-grandparents 
(Pfeffer 2014). This study suggests that there are yet other 
family members who causally affect individuals’ educational 
attainment and, thus, is a timely companion to the new exten- 
sions of intergenerational research. Simultaneously, this 
study advances the sociology of the family by addressing 
early (Irish 1964) and recent (McHale et al. 2012) calls to 
give intersibling effects more attention. 

¹¹High school GPA is the most important covariate not only in
my study but in a wealth of studies related to college attendance.
Research consistently finds that high school GPA is the strongest
predictor of college attendance (Bui and Rush 2016; Patrick,
Schulenberg, and O’Malley 2016), surpassing important factors
such as standardized test scores and parental education.

Efforts to curb social reproduction benefit from knowledge
of intersibling effects. Harvill et al. (2012) conducted a
meta-analysis of college access program evaluations and
found that, according to experimental studies, college access
programs cause a 4 percentage point increase in participants’
probability of attending college. However, these experiments
compare the outcomes of only the treatment and control
groups, without also comparing the outcomes of each group’s 
siblings. If college access programs indirectly increase sib- 
ling college attendance by increasing participant college 
attendance, then one underestimates the total benefit of such 
programs when not considering spillover effects on siblings. 
These spillover effects are perhaps most relevant in cost- 
benefit analyses of programs, for which precise estimates of 
program benefits are crucial (Loury 2004). High school 
administrators, counselors, and nonprofit officers stand to 
benefit from this knowledge because it helps them evaluate 
the programs they are funding and implementing. More par- 
ticularly, the results of this study suggest that the programs 
deserve more credit than these stakeholders tend to give. On 
a broader scale, given that intersibling effects appear more 
prevalent among first-generation than continuing-generation 
sibling sets, and given that first-generation individuals have 
more siblings to begin with, interventions that cause secular 
increases in college attendance in one birth cohort can nar- 
row inequality in the long term by causing especially great 
increases in college attendance among that birth cohort’s 
first-generation siblings. This process may be one of many 
reasons that overall educational expansion promotes social 
mobility (Breen 2010). 
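
A back-of-the-envelope sketch shows how spillovers could enter an accounting of program benefits. It rests on strong assumptions that the study does not test: that program-induced attendance affects siblings exactly as the estimated sibling effect does, that effects combine multiplicatively, and that each participant has one younger sibling. The function name and the sibling count are hypothetical.

```python
def total_program_effect(direct_effect, sibling_effect, n_younger_siblings=1):
    """Direct effect on the participant plus expected spillover onto siblings."""
    spillover = direct_effect * sibling_effect * n_younger_siblings
    return direct_effect + spillover


direct = 0.04  # 4 percentage point program effect (Harvill et al. 2012)

# Sibling effect estimates reported in this study.
total_first_gen = total_program_effect(direct, sibling_effect=0.28)
total_continuing = total_program_effect(direct, sibling_effect=0.14)

print(round(total_first_gen, 4), round(total_continuing, 4))
```

Under these assumptions, ignoring siblings would understate the total benefit per participant by about 1.1 percentage points among first-generation families and about 0.6 percentage points among continuing-generation families.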

The results of this study suggest that intersibling effects 
are sources of adult siblings’ socioeconomic correlations 
and, therefore, that intersibling effects can pollute estimates 
of intergenerational persistence when these estimates rely on 
sibling correlations. In her review of the intergenerational 
mobility literature, Torche (2015) explained that intergenera- 
tional persistence estimates from sibling-to-sibling socioeco- 
nomic correlations tend to be greater than estimates from 
parent-to-child correlations. Reflecting the traditional view, 
she argued that this is because the former also capture shared 
community influences and all unmeasured parent influences, 
but she did not attribute the differences to intersibling effects. 
My findings suggest that intersibling effects also swell sib- 
ling-to-sibling socioeconomic correlations. Thus, although 
sibling-to-sibling socioeconomic correlations can capture 
the combined impact of siblings’ shared influences, includ- 
ing their influences on one another, researchers should be 
cautious interpreting these correlations as anything but upper 
bounds on how much socioeconomic advantage flows 
directly from one generation to the next. 

At least a few directions for future research arise from this 
study. To explore mechanisms, future research might build 
on both the present, quantitative evidence of intersibling 
effects and the exclusively qualitative evidence suggesting 
that individuals give their siblings college-related informa- 
tion (McDonough 1997; Stanton-Salazar and Spina 2003). 
Survey measures of how much youths receive this informa- 
tion from their siblings would allow mediation analyses that 
could adjudicate between an information channels explana- 
tion and other explanations for intersibling effects in college 
attendance. In addition, future experimental studies of 
college access programs can make two related, simultaneous
advances in our understanding of intersibling effects. First, 
provided the programs are effective, these studies can obtain 
more plausibly exogenous estimates of intersibling college 
attendance effects by using program assignment as an instru- 
ment for sibling college attendance. Second, these studies 
can provide a thorough picture of how much college access 
programs affect college attendance by estimating spillover 
effects on program participants’ siblings. 
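
The instrumental variable idea can be sketched with the simple Wald estimator, which divides the effect of program assignment on the younger sibling's attendance by its effect on the participant's attendance. The data below are entirely hypothetical, and the sketch assumes random assignment and that assignment affects the younger sibling only through the participant's attendance.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def wald_estimate(records):
    """Wald IV estimate from (assigned, sibling_attended, respondent_attended)."""
    d1 = [d for z, d, y in records if z == 1]
    d0 = [d for z, d, y in records if z == 0]
    y1 = [y for z, d, y in records if z == 1]
    y0 = [y for z, d, y in records if z == 0]
    first_stage = mean(d1) - mean(d0)   # assignment -> participant attendance
    reduced_form = mean(y1) - mean(y0)  # assignment -> younger sibling attendance
    if first_stage == 0:
        raise ValueError("program assignment does not shift attendance")
    return reduced_form / first_stage


# Hypothetical records: (program assignment, participant attended college,
# younger sibling attended college).
records = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
    (0, 1, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0),
]
iv_effect = wald_estimate(records)
```

The requirement that the programs be effective corresponds to a nonzero first stage; with a weak first stage, the ratio becomes unstable.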